library(brms)
library(ggplot2)
library(kableExtra)
library(lme4)
library(lmerTest)
library(rmarkdown)
library(performance)
library(see)
library(sjmisc)
library(tidyverse)
options(
digits = 3
)
set.seed(170819)
Analyses with words as outcome
Set-up
Packages
Custom functions
# function to silence brms output
# note: sinking to "/dev/null" assumes a Unix-alike; nullfile() would be portable
hush <- function(code) {
  sink("/dev/null")
  tmp <- code
  sink()
  return(tmp)
}
Data
d_words <- read_csv("data/DataAggregated_T1T2_nwords.csv")
d <- read_csv("data/data.csv")
# get words from dataframe
d$n_Words <- d_words$n_Words
# same as above; but original file name:
# d <- read_csv("data/DataAggregated_T1T2_costsbenefits.csv")
# load image for work in IDE
# load("data/image_words.RData")
d <- d |>
rename(
group = roles,
gender = DE01_T1,
age = DE02_01_T1,
pol_stance = DE06_01_T1
) |>
mutate(
female = as.logical(2 - gender),
gender = factor(gender, labels = c("female", "male")),
n_Words = replace_na(n_Words, 0)
)
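As a quick sanity check on the `female` recoding above (a minimal sketch; per the factor labels, gender is coded 1 = female, 2 = male in the raw data):

```r
# gender is coded 1 = female, 2 = male, so 2 - gender is
# 1 (TRUE) for women and 0 (FALSE) for men
as.logical(2 - c(1, 2))
```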
# recode to sum (deviation) contrasts
d$anonymity_dev <- factor(d$anonymity)
contrasts(d$anonymity_dev) <- contr.sum(2)
d$persistence_dev <- factor(d$persistence)
contrasts(d$persistence_dev) <- contr.sum(2)
Descriptives
Let’s inspect distribution of words communicated.
ggplot(d, aes(n_Words)) +
geom_histogram(binwidth = 50)
Looks like a zero-inflated Poisson distribution, which confirms our preregistered plan to analyze the data with a zero-inflated Poisson model.
nrow(d[d$n_Words == 0, ]) / nrow(d)
[1] 0.206
Overall, 21% of participants did not write any words.
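A minimal simulation sketch of why this points to zero inflation: with the mean word counts observed here, a plain Poisson would essentially never produce a zero, so the observed 21% of zeros must stem from a separate zero-generating process (lambda = 400 is an assumption based on the group means below).

```r
set.seed(42)
zi <- .21      # zero-inflation probability, matching the observed share of zeros
lambda <- 400  # assumed Poisson mean, roughly the observed average word count
y <- ifelse(runif(1e4) < zi, 0, rpois(1e4, lambda))
mean(y == 0)   # close to .21: all zeros come from the inflation component
```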
We first look at the experimental groups’ descriptives.
d |>
group_by(persistence) |>
summarize(n_Words_m = mean(n_Words)) |>
as.data.frame() |>
kable()
| persistence | n_Words_m |
|---|---|
| high | 365 |
| low | 422 |
Looking at persistence, we see there’s a slight difference among groups. Participants in the low persistence condition communicated more.
d |>
group_by(anonymity) |>
summarize(n_Words_m = mean(n_Words)) |>
as.data.frame() |>
kable()
| anonymity | n_Words_m |
|---|---|
| high | 376 |
| low | 411 |
People with less anonymity communicated more, but the difference isn’t particularly large.
d |>
group_by(persistence, anonymity) |>
summarize(n_Words_m = mean(n_Words)) |>
as.data.frame() |>
kable()
| persistence | anonymity | n_Words_m |
|---|---|---|
| high | high | 356 |
| high | low | 375 |
| low | high | 397 |
| low | low | 448 |
Looking at both factors combined, we see that low anonymity and low persistence created the highest participation.
d |>
group_by(group) |>
summarize(
anonymity = anonymity[1],
persistence = persistence[1],
topic = topic[1],
n_Words_m = mean(n_Words)
) |>
rmarkdown::paged_table()
Looking at the individual groups, we do see some differences. Generally, this shows that communication varied across groups.
d |>
group_by(topic) |>
summarize(n_Words_m = mean(n_Words)) |>
as.data.frame() |>
kable()
| topic | n_Words_m |
|---|---|
| climate | 388 |
| gender | 368 |
| migration | 426 |
Looking at topics specifically, we also see that there’s some variance.
Bayesian mixed effects modeling
We analyze the data using Bayesian modelling.
We use deviation/sum contrast coding (+1, -1), meaning the contrasts estimate the main effects of the independent variables.
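A quick sketch of what `contr.sum(2)` does to a two-level factor (the levels here mirror our conditions):

```r
x <- factor(c("high", "low"))
contrasts(x) <- contr.sum(2)
contrasts(x)  # "high" is coded +1, "low" is coded -1
```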
Fixed effects
We preregistered to analyze fixed effects.
fit_fe_1 <-
hush(
brm(
n_Words ~
1 + persistence_dev * anonymity_dev + age + female + pol_stance +
(1 | topic/group)
, data = d
, chains = 4
, cores = 4
, iter = 6000
, warmup = 2000
, family = zero_inflated_poisson()
, control = list(
adapt_delta = .95
, max_treedepth = 12
)
, save_pars = save_pars(all = TRUE)
, silent = 2
)
)
Warning: There were 559 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: There were 237 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 12. See
https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
Warning: Examine the pairs() plot to diagnose sampling problems
Shows some convergence warnings. Let’s inspect model.
plot(fit_fe_1, ask = FALSE)
Trace-plots look alright.
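If the trace plots had looked problematic, a typical next step (sketched here, not evaluated) would be to refit with tighter sampler settings, reusing the already compiled model:

```r
# not run: refit with a higher adapt_delta via update() on the existing fit
fit_fe_1b <- update(
  fit_fe_1,
  control = list(adapt_delta = .999, max_treedepth = 15)
)
```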
Let’s look at results.
summary(fit_fe_1)
Warning: There were 559 divergent transitions after warmup. Increasing
adapt_delta above 0.95 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
Family: zero_inflated_poisson
Links: mu = log; zi = identity
Formula: n_Words ~ 1 + persistence_dev * anonymity_dev + age + female + pol_stance + (1 | topic/group)
Data: d (Number of observations: 960)
Draws: 4 chains, each with iter = 6000; warmup = 2000; thin = 1;
total post-warmup draws = 16000
Multilevel Hyperparameters:
~topic (Number of levels: 3)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.15 0.17 0.00 0.63 1.00 2715 4759
~topic:group (Number of levels: 48)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.37 0.04 0.30 0.45 1.00 2772 4125
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat
Intercept 5.78 0.11 5.55 6.02 1.00
persistence_dev1 -0.04 0.05 -0.14 0.07 1.00
anonymity_dev1 -0.05 0.05 -0.16 0.05 1.00
age 0.01 0.00 0.01 0.01 1.00
femaleTRUE -0.13 0.00 -0.13 -0.12 1.00
pol_stance -0.02 0.00 -0.02 -0.02 1.00
persistence_dev1:anonymity_dev1 0.05 0.05 -0.06 0.15 1.00
Bulk_ESS Tail_ESS
Intercept 4283 4646
persistence_dev1 3303 4336
anonymity_dev1 3106 4651
age 16884 12262
femaleTRUE 10329 7781
pol_stance 12176 8724
persistence_dev1:anonymity_dev1 3269 3896
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
zi 0.21 0.01 0.18 0.23 1.00 8989 7641
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
No significant effect emerged.
Let’s inspect ICC
var_ratio_fe <- performance::variance_decomposition(
fit_fe_1
, by_group = TRUE)
var_ratio_fe
# Random Effect Variances and ICC
Conditioned on: all random effects
## Variance Ratio (comparable to ICC)
Ratio: 0.45 CI 95%: [0.12 0.66]
## Variances of Posterior Predicted Distribution
Conditioned on fixed effects: 43442.51 CI 95%: [27273.26 70002.16]
Conditioned on rand. effects: 79718.68 CI 95%: [74074.94 85167.06]
## Difference in Variances
Difference: 36192.27 CI 95%: [9511.86 52872.17]
About 45% of the variance in words communicated is explained by topics and groups combined.
Let’s visualize the results to see what exactly they mean.
p <- plot(
conditional_effects(
fit_fe_1
),
ask = FALSE,
plot = FALSE
)
p_anon <-
p[["anonymity_dev"]] +
xlab("Anonymity") +
ylab("Words") +
scale_x_discrete(
limits = rev
)
# ) +
# scale_y_continuous(
# limits = c(5, 14)
# , breaks = c(6, 8, 10, 12, 14)
# )
p_pers <-
p[["persistence_dev"]] +
xlab("Persistence") +
ylab("Words") +
scale_x_discrete(
limits = rev
) +
# scale_y_continuous(
# limits = c(5, 14)
# , breaks = c(6, 8, 10, 12, 14)
# ) +
theme(
axis.title.y = element_blank()
)
p_int <-
p[["persistence_dev:anonymity_dev"]] +
xlab("Persistence") +
scale_x_discrete(
limits = rev
) +
scale_color_discrete(
labels = c("low", "high")
) +
guides(
fill = "none",
color = guide_legend(
title = "Anonymity"
)
) +
# scale_y_continuous(
# limits = c(5, 14)
# , breaks = c(6, 8, 10, 12, 14)
# ) +
theme(
axis.title.y = element_blank()
)
plot <- cowplot::plot_grid(
p_anon, p_pers, p_int,
labels = c('A', 'B', "C"),
nrow = 1,
rel_widths = c(2, 2, 3)
)
plot
ggsave("figures/results.png", plot, width = 8, height = 4)
Shows that there are no main effects. There seems to be a (nonsignificant) interaction effect: in a low-persistence environment, anonymity is conducive to communication; in a high-persistence environment, it’s the opposite.
Let’s look at posterior predictive checks.
p_1 <-
pp_check(fit_fe_1) +
labs(title = "Zero-inflated Poisson")
Using 10 posterior draws for ppc type 'dens_overlay' by default.
p_1
The actual distribution cannot be precisely reproduced, but it’s also not too far off.
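A complementary check that targets the zero mass specifically (a sketch, following the bayesplot `ppc_stat` convention of naming a summary function):

```r
# compare the observed proportion of zeros with its posterior predictive distribution
prop_zero <- function(y) mean(y == 0)
pp_check(fit_fe_1, type = "stat", stat = "prop_zero")
```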
Random effects
We preregistered to explore and compare models with random effects. So let’s model how the effects of the experimental conditions differ depending on topic.
fit_re_1 <-
hush(
brm(
n_Words ~
1 + persistence_dev * anonymity_dev + age + female + pol_stance +
(1 + persistence_dev * anonymity_dev | topic) +
(1 | topic:group)
, data = d
, chains = 4
, cores = 4
, iter = 6000
, warmup = 2000
, family = zero_inflated_poisson()
, control = list(
adapt_delta = .95
, max_treedepth = 15
)
, save_pars = save_pars(all = TRUE)
)
)
Warning: There were 525 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: Examine the pairs() plot to diagnose sampling problems
Shows some convergence warnings.
Let’s inspect model.
plot(fit_re_1, ask = FALSE)
Traceplots look alright.
Let’s look at results.
summary(fit_re_1)
Warning: There were 525 divergent transitions after warmup. Increasing
adapt_delta above 0.95 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
Family: zero_inflated_poisson
Links: mu = log; zi = identity
Formula: n_Words ~ 1 + persistence_dev * anonymity_dev + age + female + pol_stance + (1 + persistence_dev * anonymity_dev | topic) + (1 | topic:group)
Data: d (Number of observations: 960)
Draws: 4 chains, each with iter = 6000; warmup = 2000; thin = 1;
total post-warmup draws = 16000
Multilevel Hyperparameters:
~topic (Number of levels: 3)
Estimate Est.Error
sd(Intercept) 0.26 0.41
sd(persistence_dev1) 0.30 0.45
sd(anonymity_dev1) 0.26 0.40
sd(persistence_dev1:anonymity_dev1) 0.38 0.48
cor(Intercept,persistence_dev1) -0.01 0.47
cor(Intercept,anonymity_dev1) -0.00 0.47
cor(persistence_dev1,anonymity_dev1) -0.01 0.47
cor(Intercept,persistence_dev1:anonymity_dev1) -0.04 0.46
cor(persistence_dev1,persistence_dev1:anonymity_dev1) 0.03 0.47
cor(anonymity_dev1,persistence_dev1:anonymity_dev1) -0.01 0.46
l-95% CI u-95% CI Rhat
sd(Intercept) 0.00 1.43 1.00
sd(persistence_dev1) 0.01 1.64 1.00
sd(anonymity_dev1) 0.00 1.42 1.00
sd(persistence_dev1:anonymity_dev1) 0.01 1.75 1.00
cor(Intercept,persistence_dev1) -0.85 0.83 1.00
cor(Intercept,anonymity_dev1) -0.83 0.84 1.00
cor(persistence_dev1,anonymity_dev1) -0.84 0.83 1.00
cor(Intercept,persistence_dev1:anonymity_dev1) -0.85 0.81 1.00
cor(persistence_dev1,persistence_dev1:anonymity_dev1) -0.82 0.85 1.00
cor(anonymity_dev1,persistence_dev1:anonymity_dev1) -0.83 0.83 1.00
Bulk_ESS Tail_ESS
sd(Intercept) 6069 6047
sd(persistence_dev1) 4910 6104
sd(anonymity_dev1) 5291 5949
sd(persistence_dev1:anonymity_dev1) 3597 4722
cor(Intercept,persistence_dev1) 20373 11843
cor(Intercept,anonymity_dev1) 22746 11797
cor(persistence_dev1,anonymity_dev1) 17202 12534
cor(Intercept,persistence_dev1:anonymity_dev1) 19402 12987
cor(persistence_dev1,persistence_dev1:anonymity_dev1) 16036 13418
cor(anonymity_dev1,persistence_dev1:anonymity_dev1) 12041 13726
~topic:group (Number of levels: 48)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.37 0.04 0.29 0.46 1.00 6304 9612
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat
Intercept 5.77 0.25 5.24 6.21 1.00
persistence_dev1 -0.04 0.27 -0.58 0.51 1.00
anonymity_dev1 -0.06 0.24 -0.59 0.37 1.00
age 0.01 0.00 0.01 0.01 1.00
femaleTRUE -0.12 0.00 -0.13 -0.12 1.00
pol_stance -0.02 0.00 -0.02 -0.02 1.00
persistence_dev1:anonymity_dev1 0.05 0.33 -0.65 0.75 1.00
Bulk_ESS Tail_ESS
Intercept 6778 4541
persistence_dev1 7308 5714
anonymity_dev1 7787 5343
age 15915 15238
femaleTRUE 22631 10396
pol_stance 16822 9683
persistence_dev1:anonymity_dev1 6387 4994
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
zi 0.21 0.01 0.18 0.23 1.00 31492 11056
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Again, no main or interaction effects.
Let’s see if the random effects model fits better
fit_fe_1 <- add_criterion(
fit_fe_1
, "kfold"
, K = 10
, cores = 4
)
Fitting model 1 out of 10
Fitting model 2 out of 10
Fitting model 3 out of 10
Fitting model 4 out of 10
Fitting model 5 out of 10
Fitting model 6 out of 10
Fitting model 7 out of 10
Fitting model 8 out of 10
Fitting model 9 out of 10
Fitting model 10 out of 10
fit_re_1 <- add_criterion(
fit_re_1
, "kfold"
, K = 10
, cores = 4
)
Fitting model 1 out of 10
Start sampling
Fitting model 2 out of 10
Start sampling
Fitting model 3 out of 10
Start sampling
Fitting model 4 out of 10
Start sampling
Fitting model 5 out of 10
Start sampling
Fitting model 6 out of 10
Start sampling
Fitting model 7 out of 10
Start sampling
Fitting model 8 out of 10
Start sampling
Fitting model 9 out of 10
Start sampling
Fitting model 10 out of 10
Start sampling
comp_1 <- loo_compare(fit_fe_1, fit_re_1, criterion = "kfold")
comp_1
         elpd_diff se_diff
fit_re_1 0.0 0.0
fit_fe_1 -2842.7 2820.7
Although model comparisons showed that the model with random effects fitted better, the difference was not significant (Δ ELPD = -2842.75, 95% CI [-8371.27, 2685.78]). Hence, for reasons of parsimony, the model with fixed effects is preferred.
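As a transparency note, `loo_compare()` prints only the ELPD difference and its standard error; the interval reported above follows from a normal approximation. A sketch using the rounded values printed above (so the endpoints differ slightly from the exact ones):

```r
elpd_diff <- -2842.7
se_diff <- 2820.7
ci <- elpd_diff + c(-1, 1) * qnorm(.975) * se_diff
round(ci, 1)  # the interval clearly covers zero
```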
Hurdle
Let’s now estimate a fixed effects model with hurdles.
fit_hrdl_1 <-
hush(
brm(
bf(
n_Words ~
1 + persistence_dev * anonymity_dev + age + female + pol_stance +
(1 | topic) +
(1 | topic:group),
zi ~
1 + persistence_dev * anonymity_dev + age + female + pol_stance +
(1 | topic) +
(1 | topic:group)
)
, data = d
, chains = 4
, cores = 4
, iter = 6000
, warmup = 2000
, family = zero_inflated_poisson()
, control = list(
adapt_delta = .95
, max_treedepth = 15
)
)
)
Warning: There were 518 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: Examine the pairs() plot to diagnose sampling problems
Again, some warnings.
Let’s inspect model.
plot(fit_hrdl_1, ask = FALSE)
Trace-plots look alright.
summary(fit_hrdl_1)
Warning: There were 518 divergent transitions after warmup. Increasing
adapt_delta above 0.95 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
Family: zero_inflated_poisson
Links: mu = log; zi = logit
Formula: n_Words ~ 1 + persistence_dev * anonymity_dev + age + female + pol_stance + (1 | topic) + (1 | topic:group)
zi ~ 1 + persistence_dev * anonymity_dev + age + female + pol_stance + (1 | topic) + (1 | topic:group)
Data: d (Number of observations: 960)
Draws: 4 chains, each with iter = 6000; warmup = 2000; thin = 1;
total post-warmup draws = 16000
Multilevel Hyperparameters:
~topic (Number of levels: 3)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.14 0.16 0.00 0.60 1.00 4317 6264
sd(zi_Intercept) 0.35 0.50 0.01 1.76 1.00 4375 5635
~topic:group (Number of levels: 48)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.37 0.04 0.30 0.46 1.00 4196 6563
sd(zi_Intercept) 0.24 0.14 0.02 0.53 1.00 4472 6744
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat
Intercept 5.78 0.11 5.56 6.01 1.00
zi_Intercept -1.87 0.51 -2.79 -0.84 1.00
persistence_dev1 -0.04 0.05 -0.14 0.07 1.00
anonymity_dev1 -0.05 0.05 -0.16 0.05 1.00
age 0.01 0.00 0.01 0.01 1.00
femaleTRUE -0.12 0.00 -0.13 -0.12 1.00
pol_stance -0.02 0.00 -0.02 -0.02 1.00
persistence_dev1:anonymity_dev1 0.04 0.05 -0.06 0.15 1.00
zi_persistence_dev1 0.05 0.09 -0.13 0.23 1.00
zi_anonymity_dev1 0.03 0.09 -0.14 0.21 1.00
zi_age 0.02 0.01 0.00 0.03 1.00
zi_femaleTRUE 0.14 0.18 -0.21 0.49 1.00
zi_pol_stance -0.05 0.04 -0.13 0.03 1.00
zi_persistence_dev1:anonymity_dev1 0.01 0.09 -0.17 0.19 1.00
Bulk_ESS Tail_ESS
Intercept 5212 5627
zi_Intercept 10949 6527
persistence_dev1 2858 5138
anonymity_dev1 3055 5142
age 16275 12055
femaleTRUE 29563 10725
pol_stance 17068 9763
persistence_dev1:anonymity_dev1 2754 5019
zi_persistence_dev1 20542 11569
zi_anonymity_dev1 19022 11360
zi_age 23044 11719
zi_femaleTRUE 24228 10957
zi_pol_stance 25684 11393
zi_persistence_dev1:anonymity_dev1 17845 10858
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Same results: no main effects, and a slightly larger but still nonsignificant interaction effect.
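Note that in this model `zi` is on the logit scale (unlike the identity link in the model above), so the intercept needs back-transformation before it can be read as a probability. A sketch, keeping in mind that the intercept is conditional on covariates at zero/reference levels:

```r
# zi_Intercept from the summary above, mapped from logit to probability scale
plogis(-1.87)  # about 0.13
```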
Exploratory Analyses
Frequentist
Let’s look at results from a frequentist perspective.
Fixed effects
Estimate nested model.
fit_fe_1_frq <-
lmer(
n_Words ~
1 +
(1 | topic/group) +
persistence_dev * anonymity_dev + age + female + pol_stance
, data = d
)
summary(fit_fe_1_frq)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: n_Words ~ 1 + (1 | topic/group) + persistence_dev * anonymity_dev +
age + female + pol_stance
Data: d
REML criterion at convergence: 15030
Scaled residuals:
Min 1Q Median 3Q Max
-1.208 -0.507 -0.235 0.221 13.409
Random effects:
Groups Name Variance Std.Dev.
group:topic (Intercept) 13591 117
topic (Intercept) 0 0
Residual 381808 618
Number of obs: 960, groups: group:topic, 48; topic, 3
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 277.629 104.780 910.923 2.65 0.0082 **
persistence_dev1 -28.500 26.099 43.505 -1.09 0.2808
anonymity_dev1 -24.313 26.260 44.563 -0.93 0.3595
age 3.663 1.760 950.062 2.08 0.0377 *
femaleTRUE -60.301 43.364 946.729 -1.39 0.1647
pol_stance 0.468 10.152 949.443 0.05 0.9632
persistence_dev1:anonymity_dev1 6.706 26.168 43.950 0.26 0.7990
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) prss_1 anny_1 age fmTRUE pl_stn
prsstnc_dv1 0.018
annymty_dv1 0.063 0.000
age -0.758 -0.011 -0.058
femaleTRUE -0.468 -0.016 0.049 0.251
pol_stance -0.546 -0.011 -0.067 -0.002 0.063
prsstn_1:_1 0.023 0.001 -0.002 -0.053 -0.039 0.045
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
Quite weird that topic doesn’t get any variance at all, perhaps due to the small number of clusters (only three topics). With Bayesian estimation, it worked alright. Also, again no significant effects.
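The boundary fit flagged above can be checked directly with the `performance` package (already loaded); a sketch:

```r
# TRUE would confirm that a variance component collapsed to (near) zero
performance::check_singularity(fit_fe_1_frq)
```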
Estimate without nesting.
fit_fe_2_frq <-
lmer(
n_Words ~
1 +
(1 | group) +
persistence_dev * anonymity_dev + age + female + pol_stance + topic
, data = d
)
summary(fit_fe_2_frq)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: n_Words ~ 1 + (1 | group) + persistence_dev * anonymity_dev +
age + female + pol_stance + topic
Data: d
REML criterion at convergence: 15009
Scaled residuals:
Min 1Q Median 3Q Max
-1.200 -0.508 -0.233 0.219 13.366
Random effects:
Groups Name Variance Std.Dev.
group (Intercept) 14508 120
Residual 381804 618
Number of obs: 960, groups: group, 48
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 267.434 111.482 618.730 2.40 0.017 *
persistence_dev1 -28.506 26.462 41.531 -1.08 0.288
anonymity_dev1 -24.265 26.622 42.515 -0.91 0.367
age 3.680 1.762 948.098 2.09 0.037 *
femaleTRUE -59.385 43.404 944.642 -1.37 0.172
pol_stance 0.324 10.160 947.596 0.03 0.975
topicgender -13.401 64.870 41.660 -0.21 0.837
topicmigration 42.524 64.842 41.590 0.66 0.516
persistence_dev1:anonymity_dev1 6.660 26.531 41.946 0.25 0.803
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) prss_1 anny_1 age fmTRUE pl_stn tpcgnd tpcmgr
prsstnc_dv1 0.016
annymty_dv1 0.059 0.000
age -0.722 -0.011 -0.057
femaleTRUE -0.435 -0.016 0.048 0.250
pol_stance -0.505 -0.010 -0.066 -0.003 0.064
topicgender -0.285 0.000 -0.002 0.020 -0.028 -0.024
topicmigrtn -0.300 0.000 -0.001 0.028 0.000 -0.018 0.500
prsstn_1:_1 0.022 0.001 -0.002 -0.052 -0.039 0.044 -0.001 -0.002
Also shows no significant effects.
Out of curiosity, let’s also estimate a model without the hierarchical structure.
fit_fe_3_frq <-
lm(
n_Words ~
1 +
persistence_dev * anonymity_dev + topic + age + female + pol_stance
, data = d
)
summary(fit_fe_3_frq)
Call:
lm(formula = n_Words ~ 1 + persistence_dev * anonymity_dev +
topic + age + female + pol_stance, data = d)
Residuals:
Min 1Q Median 3Q Max
-604 -327 -150 125 8448
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 239.86 108.03 2.22 0.027 *
persistence_dev1 -28.61 20.28 -1.41 0.159
anonymity_dev1 -25.20 20.49 -1.23 0.219
topicgender -13.59 49.74 -0.27 0.785
topicmigration 42.39 49.71 0.85 0.394
age 3.91 1.77 2.21 0.027 *
femaleTRUE -60.28 43.67 -1.38 0.168
pol_stance 3.77 10.21 0.37 0.712
persistence_dev1:anonymity_dev1 6.94 20.37 0.34 0.733
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 628 on 951 degrees of freedom
Multiple R-squared: 0.0139, Adjusted R-squared: 0.00557
F-statistic: 1.67 on 8 and 951 DF, p-value: 0.101
Also here, no significant effects.
Gender
As preregistered, let’s see if effects differ across genders.
fit_fe_gen <-
hush(
brm(
n_Words ~
1 + persistence_dev * anonymity_dev * gender + age + pol_stance +
(1 | topic/group)
, data = d
, chains = 4
, cores = 4
, iter = 8000
, warmup = 2000
, family = zero_inflated_poisson()
, control = list(
adapt_delta = .95
, max_treedepth = 12
)
)
)
Warning: There were 815 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: There were 1125 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 12. See
https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
Warning: Examine the pairs() plot to diagnose sampling problems
Again, some warnings.
Let’s inspect model.
plot(fit_fe_gen, ask = FALSE)
Traceplots look alright.
Let’s look at results.
summary(fit_fe_gen)
Warning: There were 815 divergent transitions after warmup. Increasing
adapt_delta above 0.95 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
Family: zero_inflated_poisson
Links: mu = log; zi = identity
Formula: n_Words ~ 1 + persistence_dev * anonymity_dev * gender + age + pol_stance + (1 | topic/group)
Data: d (Number of observations: 960)
Draws: 4 chains, each with iter = 8000; warmup = 2000; thin = 1;
total post-warmup draws = 24000
Multilevel Hyperparameters:
~topic (Number of levels: 3)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.16 0.21 0.00 0.74 1.00 2230 1189
~topic:group (Number of levels: 48)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.36 0.04 0.29 0.45 1.00 5390 8676
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI
Intercept 5.66 0.12 5.41 5.91
persistence_dev1 0.01 0.05 -0.09 0.12
anonymity_dev1 -0.10 0.05 -0.21 -0.00
gendermale 0.11 0.00 0.11 0.12
age 0.01 0.00 0.01 0.01
pol_stance -0.02 0.00 -0.02 -0.01
persistence_dev1:anonymity_dev1 0.04 0.05 -0.07 0.14
persistence_dev1:gendermale -0.12 0.00 -0.12 -0.11
anonymity_dev1:gendermale 0.13 0.00 0.12 0.13
persistence_dev1:anonymity_dev1:gendermale 0.04 0.00 0.04 0.05
Rhat Bulk_ESS Tail_ESS
Intercept 1.00 4483 1383
persistence_dev1 1.00 6205 8840
anonymity_dev1 1.00 6211 7914
gendermale 1.00 20263 13441
age 1.00 22439 18918
pol_stance 1.00 21345 13579
persistence_dev1:anonymity_dev1 1.00 5838 6616
persistence_dev1:gendermale 1.00 14006 12104
anonymity_dev1:gendermale 1.00 16834 13056
persistence_dev1:anonymity_dev1:gendermale 1.00 19399 13871
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
zi 0.21 0.01 0.18 0.23 1.00 16050 12599
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Indeed, several gender effects.
- For females, the effect of persistence is larger, that is more positive.
- For females, the effect of anonymity is smaller, that is more negative.
- For females, the interaction effect is also a bit smaller, that is more negative.
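These simple effects per gender can be read off the posterior draws directly; a sketch (the column names assume brms’s default `b_` prefix, with backticks for the interaction term):

```r
draws <- brms::as_draws_df(fit_fe_gen)
# persistence effect for men = female (reference) effect + interaction term
pers_male <- draws$b_persistence_dev1 + draws$`b_persistence_dev1:gendermale`
quantile(pers_male, c(.025, .5, .975))
```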
Let’s visualize results.
p_gen <- plot(
conditional_effects(
fit_fe_gen
),
ask = FALSE,
plot = FALSE
)
p_gen_pers <-
p_gen[["persistence_dev:gender"]] +
xlab("Persistence") +
ylab("Words") +
# scale_y_continuous(
# limits = c(4, 15),
# breaks = c(5, 7.5, 10, 12.5, 15)
# ) +
scale_x_discrete(
limits = rev
) +
guides(
fill = "none"
, color = "none"
)
p_gen_anon <-
p_gen[["anonymity_dev:gender"]] +
xlab("Anonymity") +
ylab("Words") +
# scale_y_continuous(
# limits = c(3.5, 15),
# breaks = c(5, 7.5, 10, 12.5, 15)
# ) +
theme(
axis.title.y = element_blank()
) +
guides(
fill = "none"
) +
scale_x_discrete(
limits = rev
) +
scale_color_discrete(
name = "Gender"
)
plot_gen <- cowplot::plot_grid(
p_gen_pers, p_gen_anon,
labels = c('A', 'B'),
nrow = 1,
rel_widths = c(4, 5)
)
plot_gen
ggsave("figures/results_gen.png", plot_gen, width = 8, height = 4)
Benefits
Let’s see if benefits differ across experimental groups.
We first look at the experimental groups’ descriptives.
d |>
group_by(persistence) |>
summarize(benefits_m = mean(benefits, na.rm = TRUE)) |>
as.data.frame() |>
kable()
| persistence | benefits_m |
|---|---|
| high | 3.12 |
| low | 3.23 |
Looking at persistence, we see people with lower persistence reporting slightly higher benefits.
d |>
group_by(anonymity) |>
summarize(benefits_m = mean(benefits, na.rm = TRUE)) |>
as.data.frame() |>
kable()
| anonymity | benefits_m |
|---|---|
| high | 3.15 |
| low | 3.20 |
Looking at anonymity, we see people with low anonymity reporting marginally higher benefits.
d |>
group_by(persistence, anonymity) |>
summarize(benefits_m = mean(benefits, na.rm = TRUE)) |>
as.data.frame() |>
kable()
| persistence | anonymity | benefits_m |
|---|---|---|
| high | high | 3.07 |
| high | low | 3.18 |
| low | high | 3.22 |
| low | low | 3.23 |
Looking at both factors combined, we see that low anonymity and low persistence yielded the highest benefits.
Let’s see whether the effects are significant.
fit_fe_ben_1 <-
hush(
brm(
benefits ~
1 + persistence_dev * anonymity_dev + age + female + pol_stance +
(1 | topic/group)
, data = d
, chains = 4
, cores = 4
, iter = 6000
, warmup = 2000
, control = list(
adapt_delta = .95
, max_treedepth = 12
)
)
)
Warning: Rows containing NAs were excluded from the model.
Warning: There were 119 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: Examine the pairs() plot to diagnose sampling problems
Let’s inspect model.
plot(fit_fe_ben_1, ask = FALSE)
Traceplots look alright.
Let’s look at results.
summary(fit_fe_ben_1)
Warning: There were 119 divergent transitions after warmup. Increasing
adapt_delta above 0.95 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
Family: gaussian
Links: mu = identity; sigma = identity
Formula: benefits ~ 1 + persistence_dev * anonymity_dev + age + female + pol_stance + (1 | topic/group)
Data: d (Number of observations: 705)
Draws: 4 chains, each with iter = 6000; warmup = 2000; thin = 1;
total post-warmup draws = 16000
Multilevel Hyperparameters:
~topic (Number of levels: 3)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.13 0.18 0.00 0.67 1.00 2834 1761
~topic:group (Number of levels: 48)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.08 0.05 0.00 0.18 1.00 3508 4024
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat
Intercept 3.22 0.17 2.87 3.55 1.00
persistence_dev1 -0.05 0.03 -0.11 0.01 1.00
anonymity_dev1 -0.03 0.03 -0.09 0.03 1.00
age -0.00 0.00 -0.01 0.00 1.00
femaleTRUE -0.07 0.06 -0.19 0.04 1.00
pol_stance 0.02 0.01 -0.01 0.04 1.00
persistence_dev1:anonymity_dev1 -0.02 0.03 -0.08 0.04 1.00
Bulk_ESS Tail_ESS
Intercept 5256 3238
persistence_dev1 13714 8987
anonymity_dev1 15562 9606
age 22366 11104
femaleTRUE 13893 9395
pol_stance 16363 11427
persistence_dev1:anonymity_dev1 11999 8868
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.74 0.02 0.70 0.78 1.00 15218 10579
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
No significant effects. But note that the effect of persistence on perceived benefits is only marginally nonsignificant.
Costs
Let’s see if perceived costs differed across experimental groups.
We first look at the experimental groups’ descriptives.
d |>
group_by(persistence) |>
summarize(costs = mean(costs, na.rm = TRUE)) |>
as.data.frame() |>
kable()
| persistence | costs |
|---|---|
| high | 1.99 |
| low | 1.99 |
Looking at persistence, we see both groups report equal costs.
d |>
group_by(anonymity) |>
summarize(costs = mean(costs, na.rm = TRUE)) |>
as.data.frame() |>
kable()
| anonymity | costs |
|---|---|
| high | 1.89 |
| low | 2.09 |
Looking at anonymity, we see people with low anonymity report slightly higher costs.
d |>
group_by(persistence, anonymity) |>
summarize(costs = mean(costs, na.rm = TRUE)) |>
as.data.frame() |>
kable()
| persistence | anonymity | costs |
|---|---|---|
| high | high | 1.90 |
| high | low | 2.07 |
| low | high | 1.87 |
| low | low | 2.11 |
Looking at both factors combined, we see that the highest costs were reported by the group with low anonymity and low persistence.
Let’s see whether the effects are significant.
fit_fe_costs_1 <-
hush(
brm(
costs ~
1 + persistence_dev * anonymity_dev + age + female + pol_stance +
(1 | topic/group)
, data = d
, chains = 4
, cores = 4
, iter = 8000
, warmup = 2000
, control = list(
adapt_delta = .95
, max_treedepth = 12
)
)
)
Warning: Rows containing NAs were excluded from the model.
Warning: There were 119 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: Examine the pairs() plot to diagnose sampling problems
Let’s inspect model.
plot(fit_fe_costs_1, ask = FALSE)
Traceplots look alright.
Let’s look at results.
summary(fit_fe_costs_1)
Warning: There were 119 divergent transitions after warmup. Increasing
adapt_delta above 0.95 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
Family: gaussian
Links: mu = identity; sigma = identity
Formula: costs ~ 1 + persistence_dev * anonymity_dev + age + female + pol_stance + (1 | topic/group)
Data: d (Number of observations: 705)
Draws: 4 chains, each with iter = 8000; warmup = 2000; thin = 1;
total post-warmup draws = 24000
Multilevel Hyperparameters:
~topic (Number of levels: 3)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.14 0.22 0.00 0.79 1.00 4391 3051
~topic:group (Number of levels: 48)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.07 0.05 0.00 0.17 1.00 7582 9602
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat
Intercept 2.48 0.20 2.09 2.87 1.00
persistence_dev1 0.00 0.03 -0.06 0.07 1.00
anonymity_dev1 -0.08 0.03 -0.15 -0.02 1.00
age -0.01 0.00 -0.02 -0.01 1.00
femaleTRUE 0.01 0.07 -0.12 0.14 1.00
pol_stance -0.00 0.02 -0.04 0.03 1.00
persistence_dev1:anonymity_dev1 0.02 0.03 -0.05 0.09 1.00
Bulk_ESS Tail_ESS
Intercept 7138 4975
persistence_dev1 27961 11041
anonymity_dev1 31334 16273
age 40467 19477
femaleTRUE 36893 17982
pol_stance 42598 16990
persistence_dev1:anonymity_dev1 31312 16982
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.84 0.02 0.80 0.89 1.00 35120 17084
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
We find that anonymity does reduce costs.
Mediation
Let’s see if perceived benefits and costs were associated with the number of words communicated.
fit_fe_med <-
hush(
brm(
n_Words ~
1 + persistence_dev * anonymity_dev + benefits + costs + age + female + pol_stance +
(1 | topic/group)
, data = d
, chains = 4
, cores = 4
, iter = 6000
, warmup = 2000
, family = zero_inflated_poisson()
, control = list(
adapt_delta = .95
, max_treedepth = 12
)
)
)
Warning: Rows containing NAs were excluded from the model.
Warning: There were 396 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
Warning: There were 1733 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 12. See
https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
Warning: Examine the pairs() plot to diagnose sampling problems
Let’s look at results.
summary(fit_fe_med)
Warning: There were 396 divergent transitions after warmup. Increasing
adapt_delta above 0.95 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
Family: zero_inflated_poisson
Links: mu = log; zi = identity
Formula: n_Words ~ 1 + persistence_dev * anonymity_dev + benefits + costs + age + female + pol_stance + (1 | topic/group)
Data: d (Number of observations: 705)
Draws: 4 chains, each with iter = 6000; warmup = 2000; thin = 1;
total post-warmup draws = 16000
Multilevel Hyperparameters:
~topic (Number of levels: 3)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.20 0.25 0.01 0.92 1.00 2579 4250
~topic:group (Number of levels: 48)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.39 0.04 0.32 0.48 1.00 3713 5225
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat
Intercept 5.99 0.15 5.67 6.31 1.00
persistence_dev1 -0.04 0.06 -0.15 0.07 1.00
anonymity_dev1 -0.04 0.06 -0.15 0.08 1.00
benefits 0.06 0.00 0.06 0.07 1.00
costs -0.14 0.00 -0.14 -0.14 1.00
age 0.01 0.00 0.01 0.01 1.00
femaleTRUE -0.09 0.00 -0.10 -0.09 1.00
pol_stance -0.01 0.00 -0.02 -0.01 1.00
persistence_dev1:anonymity_dev1 0.05 0.06 -0.06 0.16 1.00
Bulk_ESS Tail_ESS
Intercept 5407 4531
persistence_dev1 4451 5678
anonymity_dev1 4462 5930
benefits 14066 8723
costs 13336 8776
age 17268 13032
femaleTRUE 13331 8887
pol_stance 14543 9796
persistence_dev1:anonymity_dev1 4646 5518
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
zi 0.08 0.01 0.06 0.10 1.00 13751 9148
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
We find that increased perceived costs are associated with decreased words communicated, and increased benefits with increased words communicated. Let’s check whether the overall indirect effect is significant.
anon_costs_a_b <- fixef(fit_fe_costs_1)["anonymity_dev1", "Estimate"]
anon_costs_a_se <- fixef(fit_fe_costs_1)["anonymity_dev1", "Est.Error"]
anon_costs_a_dis <- rnorm(10000, anon_costs_a_b, anon_costs_a_se)
anon_costs_b_b <- fixef(fit_fe_med)["benefits", "Estimate"]
anon_costs_b_se <- fixef(fit_fe_med)["benefits", "Est.Error"]
anon_costs_b_dis <- rnorm(10000, anon_costs_b_b, anon_costs_b_se)
anon_costs_ab_dis <- anon_costs_a_dis * anon_costs_b_dis
anon_costs_ab_m <- median(anon_costs_ab_dis)
anon_costs_ab_ll <- quantile(anon_costs_ab_dis, .025)
anon_costs_ab_ul <- quantile(anon_costs_ab_dis, .975)
The effect is significant (b = -0.01, 95% MC CI [-0.01, 0]).
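The Monte Carlo steps above can be wrapped in a small helper (hypothetical name `mc_indirect`). Note that drawing the a- and b-paths as independent normals ignores any posterior correlation between them; multiplying actual posterior draws of both coefficients would be the fully Bayesian alternative.

```r
mc_indirect <- function(a, se_a, b, se_b, n = 10000) {
  # simulate the indirect effect a * b from independent normal approximations
  ab <- rnorm(n, a, se_a) * rnorm(n, b, se_b)
  c(median = median(ab), quantile(ab, c(.025, .975)))
}
```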
Save
save.image("data/image_words.RData")